Hurricane Ike Maximum Water Levels

Compute maximum water levels during Hurricane Ike on a 9 million node triangular mesh ADCIRC storm surge model. Visualize the results using HoloViz TriMesh rendering with Datashader.

import fsspec
import numpy as np
import pandas as pd
import xarray as xr

Start a dask cluster to crunch the data

from dask.distributed import Client
from dask_gateway import Gateway

gateway = Gateway()
cluster = gateway.new_cluster()
# from dask_kubernetes import KubeCluster
# cluster = KubeCluster()
cluster
cluster.adapt(minimum=4, maximum=20);
client = Client(cluster)

Read the data using the cloud-friendly zarr data format

ds = xr.open_zarr(
    fsspec.get_mapper('s3://pangeo-data-uswest2/esip/adcirc/ike', anon=False, requester_pays=True)
)
# ds = xr.open_zarr(fsspec.get_mapper('gcs://pangeo-data/rsignell/adcirc_test01'))
ds['zeta']
<xarray.DataArray 'zeta' (time: 720, node: 9228245)>
dask.array<zarr, shape=(720, 9228245), dtype=float64, chunksize=(10, 141973), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 2008-09-05T12:00:00 ... 2008-09-10T11:50:00
    x        (node) float64 dask.array<chunksize=(141973,), meta=np.ndarray>
    y        (node) float64 dask.array<chunksize=(141973,), meta=np.ndarray>
Dimensions without coordinates: node
Attributes:
    location:       node
    long_name:      water surface elevation above geoid
    mesh:           adcirc_mesh
    standard_name:  sea_surface_height_above_geoid
    units:          m

How many GB of sea surface height data do we have?

ds['zeta'].nbytes / 1.0e9
53.1546912

Take the maximum over the time dimension and persist the data on the workers to use later. This is the computationally intensive step.

max_var = ds['zeta'].max(dim='time').persist()

Visualize data on mesh using HoloViz.org tools

import cartopy.crs as ccrs
import datashader as dshade
import geoviews as gv
import holoviews as hv
import holoviews.operation.datashader as dshade
import hvplot.xarray
import numpy as np

dshade.datashade.precompute = True
hv.extension('bokeh')
/srv/conda/envs/notebook/lib/python3.8/site-packages/holoviews/core/element.py:74: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if key is ():
/srv/conda/envs/notebook/lib/python3.8/site-packages/holoviews/core/layout.py:225: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if key is ():
/srv/conda/envs/notebook/lib/python3.8/site-packages/holoviews/element/tabular.py:60: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if heading is ():
v = np.vstack((ds['x'], ds['y'], max_var)).T
verts = pd.DataFrame(v, columns=['x', 'y', 'vmax'])
points = gv.operation.project_points(gv.Points(verts, vdims=['vmax']))
tris = pd.DataFrame(ds['element'].values.astype('int') - 1, columns=['v0', 'v1', 'v2'])
tiles = gv.tile_sources.OSM
value = 'max water level'
label = f'{value} (m)'
trimesh = gv.TriMesh((tris, points), label=label)
mesh = dshade.rasterize(trimesh).opts(cmap='rainbow', colorbar=True, width=600, height=400)
tiles * mesh

Extract a time series at a specified lon, lat location

Because Xarray does not yet understand that x and y are coordinate variables on this triangular mesh, we create our own simple function to find the closest point. If we had a lot of these, we could use a more fancy tree algorithm.

# find the indices of the points in (x,y) closest to the points in (xi,yi)
def nearxy(x, y, xi, yi):
    ind = np.ones(len(xi), dtype=int)
    for i in range(len(xi)):
        dist = np.sqrt((x - xi[i]) ** 2 + (y - yi[i]) ** 2)
        ind[i] = dist.argmin()
    return ind
# just offshore of Galveston
lat = 29.2329856
lon = -95.1535041
ind = nearxy(ds['x'].values, ds['y'].values, [lon], [lat])
ds['zeta'][:, ind].hvplot(x='time', grid=True)